Scaling & Performance in Backend Systems (Part 3)
Statelessness (Key to Horizontal Scaling)

Statelessness = no server instance holds exclusive data.

In horizontal scaling:
- Multiple servers run the same code.
- Any request can go to any server.

Requirement:
- All servers must behave identically for any request.
Why Statelessness Is Important

If one server stores unique data:
- Other servers cannot access it
- This leads to errors and inconsistent behavior

Example:
- Server A holds a user's session
- Request goes to Server B → session missing → ❌ failure
Rule of Stateless Systems
- Never store state inside a server instance
- Always store state in shared external systems
Common Stateless Design Patterns
1. Session Management

❌ Wrong:
- Store the session in server memory

✅ Correct:
- Store the session in shared storage, e.g. Redis (in-memory DB)
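The shared-session pattern can be sketched like this. `SessionStore` is a hypothetical wrapper, and a plain dict stands in for Redis so the example is self-contained; in a real deployment the same get/set calls would go to a Redis client shared by every server.

```python
import json
import uuid

class SessionStore:
    """Session storage backed by a shared store (a dict stands in for Redis)."""

    def __init__(self, backend):
        # In production this would be a Redis connection reachable by every server.
        self.backend = backend

    def create(self, user_data: dict) -> str:
        session_id = str(uuid.uuid4())
        # Redis equivalent: r.set(f"session:{session_id}", json.dumps(user_data))
        self.backend[f"session:{session_id}"] = json.dumps(user_data)
        return session_id

    def get(self, session_id: str):
        raw = self.backend.get(f"session:{session_id}")
        return json.loads(raw) if raw else None

# Server A creates the session; Server B reads it from the same shared store.
shared = {}
server_a = SessionStore(shared)
server_b = SessionStore(shared)
sid = server_a.create({"user": "alice"})
print(server_b.get(sid))  # {'user': 'alice'} — no sticky sessions needed
```

Because the session lives outside any one server, the load balancer is free to send the next request anywhere.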
2. File Storage

❌ Wrong:
- Save files on the server's local disk

✅ Correct:
- Use shared object storage (S3 / cloud storage)
3. Database

❌ Wrong:
- Local DB (SQLite on the server)

✅ Correct:
- Centralized DB (Postgres, MySQL)
Load Balancer (Core Component)
Purpose
- Distributes incoming requests across servers
How It Works

Client → Load Balancer → Server → Response → Load Balancer → Client

The load balancer decides:
- Which server handles each request
Load Balancing Algorithms
1. Round Robin

Requests are distributed sequentially:
- A → B → C → A → B → C

Best when:
- Requests are similar in cost
- Servers have equal capacity

Problem with Round Robin:
- Cannot differentiate light vs heavy requests
- May overload one server
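Round robin is a one-liner in practice. A minimal sketch using Python's `itertools.cycle` (server names here are placeholders):

```python
from itertools import cycle

servers = ["A", "B", "C"]
rr = cycle(servers)  # endlessly yields A, B, C, A, B, C, ...

# Route six incoming requests — each server gets every third request.
order = [next(rr) for _ in range(6)]
print(order)  # ['A', 'B', 'C', 'A', 'B', 'C']
```

Note that nothing in this loop looks at request cost, which is exactly why a run of heavy requests can pile up on one server.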
2. Weighted Round Robin
- Servers receive traffic in proportion to their capacity

Example:
- Server A (2× capacity) → gets 2× the requests
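One simple way to sketch weighting (real load balancers use smoother interleavings, but the proportions are the same): expand each server by its weight, then cycle.

```python
def weighted_round_robin(weights: dict):
    """Yield server names in proportion to their weight (capacity)."""
    # Expand: {"A": 2, "B": 1} -> ["A", "A", "B"], then cycle forever.
    expanded = [s for s, w in weights.items() for _ in range(w)]
    i = 0
    while True:
        yield expanded[i % len(expanded)]
        i += 1

gen = weighted_round_robin({"A": 2, "B": 1})  # A has 2x capacity
first_six = [next(gen) for _ in range(6)]
print(first_six)  # ['A', 'A', 'B', 'A', 'A', 'B'] — A gets twice the traffic
```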
3. Least Connections

Sends each request to the server with:
- The fewest active connections

Better for:
- Mixed workloads (light + heavy requests)
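The selection rule itself is tiny; the connection counts are illustrative. Servers stuck on heavy requests naturally keep their connections open longer, so new traffic flows to the less busy ones:

```python
# Current open connections per server (illustrative numbers).
active = {"A": 3, "B": 1, "C": 5}

def pick_least_connections(conns: dict) -> str:
    """Choose the server with the fewest active connections."""
    return min(conns, key=conns.get)

target = pick_least_connections(active)
print(target)        # 'B' — fewest active connections
active[target] += 1  # the new request now counts against B
```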
4. Other Algorithms
- Least response time
- Resource-based (CPU/RAM usage)
Handling Server Failures
Problem
- A load balancer may keep sending traffic to a dead server
Solution: Health Checks

The load balancer sends periodic test requests.

If a server fails:
- It is marked unhealthy
- It is removed from routing

When the server recovers:
- It is added back automatically
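A sketch of one health-check cycle. The `probe` callable is a stand-in for a real check such as an HTTP GET to a `/healthz` endpoint; here a status table simulates servers going down and recovering:

```python
def run_health_checks(servers, probe):
    """Probe each server; return the healthy pool the balancer may route to."""
    healthy = []
    for s in servers:
        if probe(s):           # e.g. HTTP GET /healthz returning 200
            healthy.append(s)  # recovered servers rejoin automatically
        # failed servers are simply left out of the routing pool
    return healthy

status = {"A": True, "B": False, "C": True}  # B is currently down
pool = run_health_checks(["A", "B", "C"], probe=lambda s: status[s])
print(pool)  # ['A', 'C'] — B removed until it passes a check again

status["B"] = True  # B recovers
recovered = run_health_checks(["A", "B", "C"], probe=lambda s: status[s])
print(recovered)  # ['A', 'B', 'C'] — B added back on the next cycle
```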
Database Scaling Challenge

Backend scaling is easy (stateless servers can simply be cloned).

The database is:
- Stateful
- Harder to scale
Read Replicas
Concept
- One primary DB (handles writes)
- Multiple replicas (handle reads)
Benefits
- Reduces load on primary DB
- Improves latency (geo-distribution)
Request Distribution
- ~70–90% reads → replicas
- Writes → primary
Problem: Replication Lag
- Data replication takes time (e.g., 200 ms)
Issue Example
- User updates name → primary DB
- Immediately fetches data → replica
- Replica not updated yet → stale data ❌
Solutions to Replication Lag
- Route reads to primary after write
- Delay read requests
- Track replication lag
- Frontend delay (controlled fetch timing)
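The first solution ("route reads to primary after write") is often called read-your-writes routing. A minimal sketch, assuming a known worst-case lag window and an in-process table of last-write times (a real system would track this per session or via replica lag metrics):

```python
import time

REPLICATION_LAG_S = 0.2   # assumed worst-case replication lag (~200 ms)

last_write = {}           # user -> timestamp of their most recent write

def record_write(user: str) -> None:
    last_write[user] = time.monotonic()

def choose_db(user: str) -> str:
    """Read-your-writes routing: recent writers read from the primary."""
    wrote_at = last_write.get(user)
    if wrote_at is not None and time.monotonic() - wrote_at < REPLICATION_LAG_S:
        return "primary"  # replicas may still be stale for this user
    return "replica"      # safe to serve from a read replica

record_write("alice")
print(choose_db("alice"))  # 'primary' — right after her write
print(choose_db("bob"))    # 'replica' — no recent writes, replicas are fine
```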
Sharding (Partitioning)
Concept
- Split a large table across multiple DB instances
Example

Orders table split by:
- Date (Jan–Jun, Jul–Dec)

Benefits
- Smaller datasets → faster queries
- Multiple DB instances → higher throughput
Key Challenge

Choosing the shard key

Example shard keys:
- Date
- User ID
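With user ID as the shard key, the routing function can be as small as a modulo. A sketch with an assumed fixed shard count (real systems often hash the key first, and use consistent hashing so shards can be added without remapping everything):

```python
SHARD_COUNT = 4

def shard_for(user_id: int) -> int:
    """Shard key routing: user ID -> shard index, stable across requests."""
    return user_id % SHARD_COUNT  # same user always lands on the same shard

print(shard_for(42))  # 2
print(shard_for(46))  # 2 — different users can share a shard
print(shard_for(7))   # 3
```

The hard part is not this function but the choice of key: a bad key (e.g. order date, when most queries hit recent orders) concentrates traffic on one "hot" shard.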
Distributed Databases (Modern Trend)
Examples
- PlanetScale
- Neon
- CockroachDB
- Yugabyte
Benefits

They handle for you:
- Replication
- Sharding
- Scaling

All managed by the provider.
Practical Advice

Don't build your own DB infrastructure early.

Use managed services:
- AWS RDS
- GCP Cloud SQL
- Neon
CDN (Content Delivery Network)
Purpose

Reduce latency caused by:
- Physical distance between user and server

Physics Limitation
- The speed of light sets a floor: roughly ~100 ms round trip over intercontinental distances
CDN Solution
- Place servers (edge nodes) near users
Benefits
1. Reduced Latency

From:
- ~100 ms → ~2–3 ms

2. Reduced Server Load
- The CDN serves cached content
- The origin server receives fewer requests
What to Cache in a CDN
1. Static Content
- JS, CSS, HTML
- Images, videos, fonts
2. API Responses

Example:
- Product catalog

CDN Cache Invalidation

Purge the cache using:
- Tags
- Events
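Tag-based purging can be sketched as follows. `EdgeCache` and its methods are hypothetical, not a real CDN API, but real CDNs expose the same idea: tag objects when caching, then invalidate every object carrying a tag in one call (often triggered by an event like "product updated").

```python
class EdgeCache:
    """Minimal sketch of tag-based cache invalidation (hypothetical API)."""

    def __init__(self):
        self.entries = {}  # url -> cached body
        self.tags = {}     # tag -> set of urls carrying that tag

    def put(self, url, body, tags):
        self.entries[url] = body
        for t in tags:
            self.tags.setdefault(t, set()).add(url)

    def get(self, url):
        return self.entries.get(url)

    def purge_tag(self, tag):
        """Invalidate every cached object tagged with `tag` in one call."""
        for url in self.tags.pop(tag, set()):
            self.entries.pop(url, None)

cache = EdgeCache()
cache.put("/products", "[...catalog json...]", tags=["catalog"])
cache.put("/logo.png", "<image bytes>", tags=["static"])
cache.purge_tag("catalog")    # e.g. fired by a "product updated" event
print(cache.get("/products")) # None — purged, next request goes to origin
print(cache.get("/logo.png")) # '<image bytes>' — untouched
```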
CDN for Security
DDoS Protection

The CDN absorbs malicious traffic.

This prevents:
- Server crashes
- Cost explosions

Edge Computing
- CDN nodes sit at the edge of the network
- Located near ISPs
- Serve content almost instantly
Final Takeaways

Statelessness enables horizontal scaling.

Load balancers distribute traffic intelligently.

Health checks route traffic away from failed servers.

Databases scale via:
- Replication
- Sharding

CDNs solve:
- Latency
- Load
- Security